ABSTRACT

Chromatin immunoprecipitation identified 191 bind- 
ing sites of Mycobacterium tuberculosis cAMP re- 
ceptor protein (CRP Mt ) at endogenous expression 
levels using a specific α-CRP Mt antibody. Under
these native conditions an equal distribution be- 
tween intragenic and intergenic locations was ob- 
served. CRP Mt binding overlapped a palindromic con- 
sensus sequence. Analysis by RNA sequencing re- 
vealed widespread changes in transcriptional profile 
in a mutant strain lacking CRP Mt during exponential 
growth, and in response to nutrient starvation. Differ- 
ential expression of genes with a CRP Mt -binding site 
represented only a minor portion of this transcriptional
reprogramming with ~19% of those representing
transcriptional regulators potentially controlled
by CRP Mt . The subset of genes that are differentially 
expressed in the deletion mutant under both culture 
conditions conformed to a pattern resembling canon- 
ical CRP regulation in Escherichia coli, with bind- 
ing close to the transcriptional start site associated 
with repression and upstream binding with activa- 
tion. CRP Mt can function as a classical transcription 
factor in M. tuberculosis, though this occurs at only 
a subset of CRP Mt -binding sites. 

INTRODUCTION 

The success of Mycobacterium tuberculosis as one of the
most deadly human pathogens depends on its ability to
adapt to diverse intracellular and extracellular environments
encountered during infection, persistence and transmission
[reviewed in (1)]. This is mediated in part by an extensive
repertoire of transcriptional regulators, including
alternative sigma factors, two-component signal transduction
proteins and serine-threonine protein kinase sensors (2).
Defining the scope of individual regulators and their
participation in integrated regulatory networks generates
insights into the in vivo physiology of M. tuberculosis
that will assist in the selection of optimal treatment
strategies (3). A combination of chromatin immunoprecipitation
(ChIP) with sequence-based transcriptional profiling provides
a powerful approach to this goal. Whilst some mycobacterial
transcription factors display a restricted profile of binding
to a limited set of regulated promoters (4-6), a recent study
of EspR revealed a much broader profile resembling that of
nucleoid-associated proteins (NAPs) involved in structural
organization of the chromosome (7,8). High-throughput ChIP
experiments based on overexpressed transcription factors in
M. tuberculosis systematically detect a wide repertoire of
low affinity binding sites, however, suggesting that there
may be no clear-cut distinction between proteins with
localized and generalized binding profiles (9). In the
present study, we set out to define the genomic binding
profile of cAMP (cyclic adenosine monophosphate) receptor
protein (CRP Mt) of M. tuberculosis with a particular
emphasis on the localization of CRP Mt -binding sites
relative to transcription start sites (TSSs).

CRP is one of the best-characterized transcription fac- 
tors in the model organism, Escherichia coli, with a role in 
regulation of around 200 transcriptional units. CRP binds 
to a palindromic sequence (TGTGAN6TCACA) in the promoters
of target genes and, depending on the distance between the
binding site and the transcriptional start site, enhances
or restricts recruitment of ribonucleic acid (RNA)
polymerase [reviewed in (10,11)]. In addition to binding to
target promoters, ChIP-chip analysis of E. coli CRP uncovered
an extensive background pattern of low affinity sites
suggesting that CRP may have an additional role as a more
general chromosomal organizer (12). A key feature of E. coli 
CRP is that its affinity for deoxyribonucleic acid (DNA) is 
strongly dependent on binding of the cAMP ligand, allow- 
ing it to play a central role in the global coordination of 
transcriptional reprogramming required for optimal utiliza- 
tion of different carbon substrates. 
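
As an illustration of how the palindromic consensus quoted above can be used to locate candidate sites, the following minimal sketch slides a 16-bp window along a sequence and counts mismatches against TGTGA-N6-TCACA. The function names, the mismatch threshold and the toy sequence are assumptions made for this example and are not taken from the study.

```python
# Canonical E. coli CRP consensus quoted in the text: TGTGA-N6-TCACA.
CONSENSUS = "TGTGANNNNNNTCACA"


def count_mismatches(window, consensus=CONSENSUS):
    """Count positions where the window differs from the consensus (N matches any base)."""
    return sum(1 for base, ref in zip(window, consensus) if ref != "N" and base != ref)


def scan_for_sites(sequence, max_mismatches=2):
    """Return (start, window, mismatches) for every window within the mismatch threshold."""
    sequence = sequence.upper()
    width = len(CONSENSUS)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = count_mismatches(window)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence with one perfect site starting at index 3.
    toy = "GGGTGTGACCCCCCTCACAAAA"
    print(scan_for_sites(toy, max_mismatches=0))
```

Because the consensus is its own reverse complement, a window and its reverse complement have the same mismatch count, so scanning a single strand is sufficient in this simplified setting.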

The corresponding M. tuberculosis CRP Mt, encoded by
Rv3676, recognizes a similar binding motif and has been
shown to regulate transcription of several promoters, for
example, the serC promoter (13), the rpfA promoter (14),
the whiB1 promoter (15,16) and the frdA promoter (17).
Computational approaches and in vitro DNA-binding studies
suggest that M. tuberculosis CRP Mt has multiple targets
and is likely to share the global profile of the E. coli
homologue (18-20). A significant difference between the two
proteins is that cAMP binds to M. tuberculosis CRP Mt with
weak affinity and has less effect on its binding to DNA
[(21), for review see (22)]. This may reflect differences in
the abundance and regulation of cAMP in mycobacteria
(23,24). In contrast to the single enzyme in E. coli,
M. tuberculosis has 17 genes encoding adenylate cyclases
(25,26), and the dynamics of cAMP synthesis and secretion are
proposed to play an important role during infection [(27), for
review see (23)]. Consistent with a role in pathogenesis,
deletion of the crp gene results in significant impairment of
in vitro growth of M. tuberculosis and attenuates virulence in
a mouse model (14). We anticipated that comprehensive mapping
of the binding profile of M. tuberculosis CRP Mt would
assist in characterization of this global regulator and
contribute more broadly to a fundamental understanding of
gene regulation in mycobacteria.

MATERIALS AND METHODS 

Bacterial strains, plasmids and growth conditions 

The strains used were E. coli strain DH5α, for all plasmid
construction, and M. tuberculosis strains H37Rv (wild
type) and M. tuberculosis Δcrp, an H37Rv mutant in which
Rv3676 has been deleted (14). E. coli cultures were grown
in Luria-Bertani broth and agar (15 g/l). Where needed,
ampicillin and kanamycin were used at final concentrations
of 100 and 50 µg/ml, respectively. M. tuberculosis cultures
were grown in Dubos broth containing 0.05% (v/v) Tween,
Middlebrook 7H9 broth or Middlebrook 7H11 agar and
supplemented with 0.5% (v/v) glycerol and 4% albumin. To
monitor the response to nutrient starvation (28),
M. tuberculosis was grown to mid-exponential phase
(OD600 0.6), the cells pelleted and washed twice in
phosphate buffered saline (PBS) supplemented with 0.025%
tyloxapol. Cells were then resuspended in 100 ml PBS plus
tyloxapol and incubated without shaking at 37°C.

Chromatin immunoprecipitation 

ChIP was performed as previously described (29) with some
modifications to the protocol. Rv3676 protein tagged with
hexa-His at the N-terminus was purified in E. coli as
described previously (30) and used to produce CRP Mt -specific
polyclonal antibodies in rabbits. The primary dose (300-µg
protein) was administered subcutaneously in Freund's
complete adjuvant, followed by two booster injections in
Freund's incomplete adjuvant after 14 and 30 days. Sera
were prepared and then stored at -20°C until required.
Immunoglobulin G (IgG) was purified from the serum using
T-Gel purification kit (Pierce) as per manufacturer's
instructions. Purified IgG was used for ChIP-seq analysis.
Rv3597c antibody was raised by Cambridge Research
Biochemicals, using a peptide antigen.

M. tuberculosis H37Rv (wild type) and M. tuberculosis
Δcrp cells were grown in roller bottles to mid-exponential
phase (OD600 0.6) and formaldehyde was added to a final
concentration of 1%. After 10 min of incubation, glycine
was added to a final concentration of 0.5 M to quench the
reaction and incubated for a further 5 min. Cross-linked
cells were harvested by centrifugation and washed twice
with ice-cold Tris-buffered saline (TBS, pH 7.5). Cell
pellets were resuspended in 4-ml immunoprecipitation (IP)
buffer [50-mM HEPES-KOH (pH 7.5), 150-mM NaCl, 1-mM
ethylenediaminetetraacetic acid (EDTA), 1% Triton X-100,
0.1% (w/v) sodium deoxycholate, 0.1% sodium dodecyl
sulphate (SDS), 0.1-mg/ml RNase A and one protease
inhibitor cocktail tablet (Roche)]. Cells were lysed and the
DNA sheared to an average size of ~300 base pairs (bp)
using a Bioruptor (Diagenode) with 40 cycles of 30-s on/off
at high setting. Insoluble matter was removed by
centrifugation for 10 min at 4°C, and the supernatant was
split into two 1.8-ml aliquots. The remaining ~400 µl was
kept to check the size of the DNA fragments and for
sequencing to remove any sequencing bias (input).

Each 1.8-ml aliquot was incubated with 20-µl Protein
A/G UltraLink Resin (Pierce) on a rotary shaker for 45
min at room temperature to eliminate complexes bound
to the resin non-specifically. The supernatant was then
removed and incubated with either no antibody (mock-IP)
or specific α-CRP Mt antibody raised against purified CRP Mt
(Rv3676) protein (Supplementary Figure S1) and 50-µl
Protein A/G UltraLink Resin, pre-incubated with 1-mg/ml
bovine serum albumin in TBS, on a rotary shaker at 4°C
overnight. Samples were washed once with IP buffer, twice
with IP buffer containing 500-mM NaCl, once with wash
buffer [10-mM Tris (pH 8.0), 250-mM LiCl, 1-mM EDTA,
0.5% Tergitol (Sigma) and 0.5% sodium deoxycholate] and
once with TE (pH 7.5). Immunoprecipitated complexes
were eluted from the resin in 100-µl elution buffer [10-mM
Tris (pH 7.5), 10-mM EDTA and 1% SDS] at 65°C for 30
min. Immunoprecipitated samples and the input DNA were
de-cross-linked in 0.5× elution buffer containing 0.8-mg/ml
Pronase at 42°C for 2 h followed by 65°C for 6 h. DNA was
purified using a polymerase chain reaction (PCR)
purification kit (QIAGEN). Prior to library preparation and
sequencing, the DNA fragment sizes were checked by agarose
gel electrophoresis and gene-specific quantitative PCR was
carried out.

RNA isolation 

RNA was extracted from exponentially growing wild-type
and Δcrp M. tuberculosis H37Rv in Middlebrook 7H9 media,
as previously described (3). Briefly, 25 ml of mid-log
phase cultures were pelleted at 4°C, and RNA was isolated
using the FastRNA Pro Blue kit (MP Bio), according to the
manufacturer's guidelines. Following extraction, RNA was
treated with Turbo DNase (Ambion) to degrade all DNA
present, and the quality and integrity were assessed using a
Nanodrop (ND-1000, Labtech) and Agilent bioanalyzer.

Library construction and Illumina sequencing 

For ChIP-seq, the concentration of the immunoprecipitated
DNA samples was measured before and after library
construction using the Qubit HS DNA kit (Invitrogen).
Library construction and sequencing were performed using
the ChIP-seq Sample Prep kit, Reagent Preparation kit and
Cluster Station kit (Illumina). Samples were sequenced on
an Illumina Genome Analyzer IIx (GAIIx) instrument and
loaded at a concentration of 10 pM. For RNA sequencing
(RNA-seq), strand-specific cDNA libraries were constructed
using the Illumina Small RNA Sample preparation kit with
the v1.5 small RNA (sRNA) 3' Adaptor and sequenced on a
GAIIx (Illumina).

Data analysis 

Sequence reads were aligned to the reference sequence of M. 
tuberculosis H37Rv (EMBL accession number AL123456) 
as single-end data using the Burrows-Wheeler aligner 
(BWA) (31). The genome coverage was calculated us- 
ing Samtools (32) and visualized in the Artemis genome 
browser. For ChIP-seq, peaks were called using a
combination of CisGenome (33) and the BayesPeak package
in R/Bioconductor (34). For RNA-seq, differential
gene expression was analysed using the DESeq package in
R/Bioconductor (35). For evaluating the location of binding
sites as potential transcriptional activators or repressors,
a region between -200 and +20 bp with respect to
the transcriptional start site was considered. For functional 
enrichment analysis, the 87 genes with CRP-binding sites 
up to 200 bp upstream from the annotated TSS (36) were 
analysed. All functional categories were assigned using Tu- 
berculist annotations. GraphPad Prism v6.04 was used to 
compare the frequencies of different functional categories in 
respect to the H37Rv genome distribution using two-tailed 
Chi-square tests. 
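
The functional enrichment comparison described above can be sketched as a 2x2 chi-square test. The analysis in this study was run in GraphPad Prism; the function below is a stand-in written for illustration only, uses no continuity correction, and treats the CBS gene set and the genome as two independent groups, which is a simplification. All counts in the example are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    a, b: CBS genes inside / outside the functional category.
    c, d: reference genes inside / outside the same category.
    Returns (chi-square statistic, two-tailed P-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 30 of 87 CBS genes vs 770 of 4000 genome genes in one category.
    stat, p = chi_square_2x2(30, 57, 770, 3230)
    print(f"chi2 = {stat:.2f}, P = {p:.4f}")
```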

Quantitative real-time PCR 

Quantitative real-time PCR (qRT-PCR) was used to determine
whether the ChIP experiment had worked prior to library
construction and to validate both the ChIP-seq and
RNA-seq data. To measure the enrichment of TF-binding
targets in the immunoprecipitated DNA samples, 1 µl of IP
or mock-IP DNA was used with Quantitect SYBR Green
(QIAGEN) together with specific primers to the upstream
region of Rv3616c, known to be bound by CRP Mt. To
validate the RNA-seq data, specific primers were used to
quantify the messenger RNA targets showing up- or
downregulation, and control targets showing no differential
expression. RNA was extracted as described above, and 30 ng
total RNA from wild-type and Δcrp cells was used with the
Express One-Step SYBR GreenER kit (Invitrogen) according
to the manufacturer's guidelines. All primer sequences and
qRT-PCR results are in Supplementary Tables S1 and S2.
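
The text does not state how enrichment was calculated from the qRT-PCR data. One common convention, shown here purely as an assumed sketch, expresses enrichment of a target in the IP relative to the mock-IP from the difference in threshold cycles (Ct), assuming perfect doubling per cycle. The function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_ip, ct_mock, efficiency=2.0):
    """Fold enrichment of IP over mock-IP from threshold cycles.

    Assumes equal input volumes and a constant amplification efficiency
    (2.0 means the product doubles every cycle).
    """
    return efficiency ** (ct_mock - ct_ip)


if __name__ == "__main__":
    # Hypothetical Ct values: the IP sample crosses the threshold 5 cycles earlier.
    print(fold_enrichment(ct_ip=22.0, ct_mock=27.0))
```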



cAMP measurement 

Samples for cAMP measurement during the starvation
response were taken at time points 0 h, 24 h, 48 h and 96
h. At each time point, 2 ml of culture was spun down,
resuspended in 0.1-M HCl and boiled for 10 min. The
samples were lysed in the presence of glass beads
(150-212 µm; Sigma) using a Ribolyser (Hybaid) at a speed
setting of 6.0 for 40 s. The supernatant was collected by
centrifugation and stored at -20°C until the assay was
performed. Levels of cAMP in the cells were measured using
the Direct cAMP Enzyme Immunoassay kit (Sigma), following
the acetylated version according to the manufacturer's
guidelines, and normalized to the total amount of protein
in the samples (measured using a Nanodrop ND-1000).

RESULTS 

Genomic mapping of the CRP Mt -binding profile 

The aim of this work was to investigate CRP Mt binding 
to the M. tuberculosis chromosome by ChIP combined 
with high-throughput sequencing (ChIP-seq) and to integrate
these data with transcriptional profiling by RNA-seq.
ChIP-seq was done using a specific antibody against
CRP Mt , thus enabling us to study the binding under native 
conditions without the need to tag and overexpress the pro- 
tein. Under these native conditions, CRP Mt is expressed at 
high levels in the cell; based on published quantitative mass 
spectrometry and electron microscopy (36,37) and western 
blot analysis we estimate the number of CRP Mt molecules 
to be approximately stoichiometric to the number of ribo- 
somes per cell (~3500). 

We were able to map 98% of the sequences uniquely to 
the H37Rv genome (allowing for up to two mismatches per 
read) and achieved a near-complete representation of the 
entire genome (98% of the genome was mapped). The re- 
maining 2% of the genome includes PE/PPE genes, which 
contain highly repetitive sequences that are poorly resolved 
by short-read sequencing. To visualize the genome cov- 
erage, the number of reads mapping to each position on 
the M. tuberculosis chromosome was calculated and the 
traces visualized in the Artemis genome browser. Peaks 
were called using the CisGenome software (33) to identify 
enriched regions in the CRP Mt -IP compared to the mock-IP 
(performed in the absence of antibody) and input (sheared 
genomic DNA). To validate the results, the data were also 
analysed using the BayesPeak package in R/Bioconductor, 
and peaks called in only one of the two methods were dis- 
carded. As seen in Figure 1A, there was no significant en- 
richment of any regions of the M. tuberculosis chromo- 
some in the mock-IP (or the input; data not shown) indi- 
cating negligible non-specific binding to the resin or the an- 
tibody. In the CRP Mt -IP, however, 191 peaks were found, 
denoting CRP Mt -binding sites (here abbreviated to CBSs) 
on the chromosome (Figure 1A). The average length of 
the CBSs was 276 bp, and in total this represents 0.7% of 
the entire M. tuberculosis genome. No differences were
observed in CRP Mt -DNA binding between cells grown in
Dubos medium and cells grown in Middlebrook 7H9 medium
(data not shown). No ChIP-seq signals were detected for the
crp-deletion strain.
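
The rule of keeping only peaks reported by both peak callers can be illustrated with a simple interval-overlap filter. This is a minimal sketch and not the procedure implemented in CisGenome or BayesPeak; peaks are represented as hypothetical (start, end) coordinate pairs and any overlap of at least one base counts as agreement.

```python
def overlaps(peak_a, peak_b):
    """True if two closed intervals (start, end) share at least one position."""
    return peak_a[0] <= peak_b[1] and peak_b[0] <= peak_a[1]


def consensus_peaks(calls_a, calls_b):
    """Keep peaks from the first caller that overlap any peak from the second caller."""
    return [peak for peak in calls_a if any(overlaps(peak, other) for other in calls_b)]


if __name__ == "__main__":
    # Hypothetical peak coordinates from two callers.
    caller_one = [(100, 380), (5000, 5300), (9000, 9250)]
    caller_two = [(150, 400), (9100, 9400), (12000, 12300)]
    print(consensus_peaks(caller_one, caller_two))
```

Peaks called by only one method, such as (5000, 5300) and (12000, 12300) in this toy input, are discarded.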
Figure 1. ChIP-seq mapping of CRP Mt -binding sites. (A) Sequence reads
from ChIP are mapped onto the M. tuberculosis H37Rv genome and
displayed using the Artemis browser. No peak enrichment was observed in a
mock-IP sample (red trace). One hundred and ninety one peaks were
identified in the IP sample using anti-CRP Mt antibody (black trace). (B) A
consensus motif resembling that defined for E. coli CRP was identified within
50 bp of the bound centre for 97% of the 191 ChIP-seq peaks. (C) A clear
correlation was observed between peak enrichment and match to the
consensus motif. (D) Representative Artemis profiles of CBSs. CRP Mt
binding (i) to the divergent promoter region between Rv0078A and Rv0079; (ii)
in the intragenic region within Rv1592c; and (iii) overlapping the sRNA
ncRv13943. Red boxes denote the CBS and blue boxes denote the TSS.



De novo motif discovery by MEME-ChIP (38), using 50 
bp upstream and downstream of the centre of each peak, 
identified a consensus motif present in 97% of the 191 bind- 
ing sites that is similar to motifs previously predicted for 
M. tuberculosis CRP Mt and experimentally defined for E. 
coli CRP (Figure 1B). The midpoint of each ChIP peak was
compared to the centre of the CRP Mt -binding site, as pre- 
dicted from the consensus sequence, and the average differ- 
ence was found to be 6 bp, with a correlation coefficient 
of 0.995 between the two data sets. At 14 of the sites, we 
identified more than one copy of the consensus motif. At 
three of the sites we were unable to identify a match to the 
consensus motif. The enrichment of the DNA fragments in 
the CRP Mt -IP compared to the mock-IP was inversely pro- 
portional to the number of mismatches found at each site 
(Figure 1C). The top 14 sites, with an enrichment factor 
(maxT) of 10 or more, are shown in Table 1 and representative
Artemis profiles are illustrated in Figure 1D (for details
of all sites, see Supplementary Table S3). 



Location of CBSs

Sixty nine of the CRP Mt -binding sites mapped uniquely to
a location within a protein coding gene or stable RNA, with
a possible role in long-distance regulation and/or chromosome
organization. Of the remaining sites, 86 CBSs mapped
uniquely to intergenic loci corresponding to potential
promoter regions, whilst 35 CBSs were located within a
protein coding sequence in a region that could also serve as
the potential promoter of a downstream gene. This
represents a significant enrichment of intergenic regions over
that predicted by chance, considering ~90% of the entire
M. tuberculosis genome is intragenic. The distribution
between intragenic and intergenic locations remained
approximately equal irrespective of the fold-enrichment used as
cutoff, indicating that CRP Mt binds with similar affinity to
both types of site. In 32 cases, the CRP Mt -binding site was
located between divergently transcribed gene pairs; this is
proportional to the genome average of 16% of all genes with
divergent orientation in M. tuberculosis. CRP regulation of
divergent gene pairs has also been observed in E. coli (39).
In some instances, CBSs mapped to the intergenic region
between convergent gene pairs, like Rv2866 and Rv2867c
and Rv2451 and Rv2452c. Three CRP Mt -binding sites also
mapped upstream of the sRNAs, ncRv13843, ncRv11373
and ncRv13660c (40).

Canonically positioned CBSs are associated with functional categories

To define the precise location of CBSs with respect to
transcriptional start sites (TSSs), we integrated the ChIP data
set with a M. tuberculosis TSS map generated by sequence
analysis of 5'-triphosphate-enriched RNA libraries (36).
The spacing between the midpoint of each CBS motif and
adjacent primary TSSs is recorded in Supplementary
Table S3. Including data from peaks with multiple motifs, and
CBSs mapping to more than one TSS, we measured a total
of 242 CBS-TSS pairs; in 203 cases, the CBS was located
within 500 bp of a TSS, 127 sites were upstream and
76 downstream, 41 of which were between the TSS and the
start codon (i.e. within the 5'-untranslated region, UTR)
and 35 within the coding sequence. Plotting of the
distribution of CBS-TSS spacing revealed clustering in the regions
from -60 to -40, and from +1 to +20 (Figure 2A and B).
Genes harbouring CRP-binding sites in the -200 to 0
region were analysed for functional categories. Amongst the
87 genes that contain a CRP site in a putative promoter
region, genes involved in cell wall and cell processes were
enriched in our data set compared to the H37Rv genome
(Chi-square test, P = 0.021; Figure 2C).

Several of the CBSs have also been identified as binding
sites for other transcription factors, suggesting that CRP Mt
may act in concert with other regulators. The promoter
region of Rv1057, for example, has binding sites for MprA,
EspR and TrcR in addition to CRP Mt (7,41) (Figure 3).
Additional promoter regions with binding sites for
multiple transcription factors include fadD26 (Rv2930) with an
EspR-binding site (7) and espA (Rv3616c) with binding sites
for EspR, MprA and CRP (7,42-45).
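
The enrichment of intergenic CBSs reported in the Location of CBSs section can be sanity-checked with a one-sided binomial test. The sketch below uses the figures quoted in the text (86 uniquely intergenic sites out of 191, and ~10% of the genome being intergenic) and assumes, as a simplification not made explicit in the study, that each site falls in intergenic sequence independently with probability 0.10 under the null hypothesis.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null):
    """P(X >= successes) for X ~ Binomial(trials, p_null), summed exactly term by term."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    intergenic_sites = 86    # CBSs mapped uniquely to intergenic loci (from the text)
    total_sites = 191        # total CBSs (from the text)
    p_intergenic = 0.10      # ~90% of the genome is intragenic (from the text)

    expected = total_sites * p_intergenic
    p_value = binomial_upper_tail(intergenic_sites, total_sites, p_intergenic)
    print(f"expected by chance: {expected:.1f} sites, observed: {intergenic_sites}")
    print(f"one-sided P = {p_value:.3g}")
```

Under these assumptions only about 19 intergenic sites would be expected by chance, so the observed count is far in excess of the null expectation.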



Transcriptional regulation of CBS genes 

Previous microarray and targeted qRT-PCR analyses have 
demonstrated differential expression of CBS genes follow- 
ing deletion of the crp gene (14). Using an RNA-seq ap- 
proach to compare the transcriptional profile of wild-type 
and crp-deletion strains during exponential growth, we ob- 
served widespread changes in gene expression affecting 
more than 20% of the total transcriptome (Supplementary 
Table S4). Filtering based on an adjusted P-value of <0.05
identified 453 genes with >2-fold increased abundance in
the knockout and 412 with >2-fold decrease. CBS genes
comprised only a minor fraction of the differential
expression profile, with statistically significant upregulation of 37
genes and downregulation of 15 genes. Forty eight per cent
of the CBS-regulated genes corresponded to genes
annotated as key metabolic enzymes or genes with predicted roles
in transcriptional regulation that could amplify the CRP Mt
regulatory signal (Supplementary Table S5). Fifty per cent
of the differential expression profile (212 of the upregulated
genes and 211 of the downregulated genes) was shared with
the response to nutrient starvation (Supplementary Table
S4), and is likely to reflect secondary effects associated with
the marked growth defect of the crp mutant.

Table 1. ChIP peaks with highest fold enrichment

a Enrichment factor (maxT) of the peaks between the CRP Mt -IP and the mock-IP as calculated by the CisGenome software.

b Centre of the CRP Mt -binding site based on the consensus sequence (Figure 1B).

c Distance of the centre of the ChIP peak (bound centre) to the centre of the CRP Mt -binding site.

d Distance of the centre of the CRP Mt -binding site to the TSS, as defined by (36) and documented in Supplementary Table S3.
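
The filtering criteria used for the RNA-seq comparison (adjusted P-value <0.05 and >2-fold change) can be expressed as a small classifier. This is a sketch for illustration, not the DESeq workflow itself; it assumes that fold changes are reported as log2(knockout/wild type), and the gene identifiers and values in the example are hypothetical.

```python
from math import log2


def classify_gene(log2_fold_change, padj, alpha=0.05, min_fold=2.0):
    """Label a gene 'up', 'down' or 'unchanged' in the knockout relative to wild type."""
    if padj is None or padj >= alpha:
        return "unchanged"
    threshold = log2(min_fold)  # a 2-fold change corresponds to |log2 FC| of 1
    if log2_fold_change > threshold:
        return "up"
    if log2_fold_change < -threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical results: (gene, log2 fold change, adjusted P-value).
    results = [("geneA", 1.8, 0.001), ("geneB", -2.3, 0.010), ("geneC", 1.5, 0.200)]
    for gene, lfc, padj in results:
        print(gene, classify_gene(lfc, padj))
```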



We anticipated that if CRP Mt was acting together with
other transcription factors, differential expression of CBS
genes may be enhanced under alternative growth
conditions. ChIP-seq analysis revealed a general decrease in
CRP Mt binding to DNA after incubation for 24 h in PBS
(Figure 4A). Furthermore, a reduction in the amount of
cAMP was observed in the nutrient starvation model
(Figure 4B). There was no significant change in CRP Mt
protein abundance in the starvation model (36). The
majority of the ChIP-seq peaks identified in exponential culture
were also detected after starvation, though with a reduction
in fold-enrichment and loss of 33 of the 76 peaks having
an enrichment ratio of less than 5 in the exponential data
set. The ChIP-seq peaks not identified after starvation were
not enriched in any specific functional category and only
53% of the CBS-associated genes in exponential culture
showed downregulation during starvation. Comparison of
the wild-type and mutant strains under starvation
conditions revealed wide-ranging differences in the overall
transcript profile, with 361 genes having >2-fold higher
abundance and 465 reduced abundance in the knockout
(Supplementary Table S4), but again CBS genes made only a minor
contribution, with 28 genes upregulated and 33
downregulated.

Figure 2. CBS distribution around transcriptional start sites. (A) CRP Mt sites in M. tuberculosis show a clustering in the region around TSSs (n = 203). (B)
The distribution of CBS-TSS distances for M. tuberculosis CRP Mt sites is compared to a similar data set from E. coli (39). (C) Genes harbouring CRP Mt
sites in the -220 to 0 region (n = 87) were analysed for functional categories according to Tuberculist annotations. Bar graphs represent the percentage
of each functional class for CBS genes (black bars) compared to the distribution of these classes amongst all H37Rv genes (grey bars). Asterisks denote
functional categories that are statistically significant after Chi-square test analyses.

Twenty nine CBS genes showing a concordant response
in a comparison of wild-type and crp deletion strains
under both culture conditions are shown in Table 2, ranked
according to the distance between the CBS and the TSS.
Whilst the number of differentially expressed genes is low,
the results are consistent with the canonical E. coli model
of CRP Mt binding close to the TSS inhibiting
transcription and upstream binding enhancing transcription. There
was no obvious pattern of up- or downregulation associated
with CRP Mt binding at sites distant from the TSS.
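
The comparison with the canonical model can be made explicit with a short check of whether a gene's response to crp deletion matches the direction expected from the position of its CBS. The -200 to +20 window is taken from the Methods; the -20 boundary separating "TSS-proximal" from "upstream" sites is an assumption chosen only for this sketch, as are the function names. Fold changes are assumed to be log2(knockout/wild type), and the example rows use values listed in Table 2.

```python
def expected_direction(cbs_to_tss, proximal_start=-20.0, window=(-200.0, 20.0)):
    """Expected change in the crp knockout under the canonical model.

    Sites close to the TSS are expected to repress (expression goes 'up' without CRP);
    sites further upstream are expected to activate (expression goes 'down').
    Returns None for sites outside the window considered.
    """
    low, high = window
    if not low <= cbs_to_tss <= high:
        return None
    return "up" if cbs_to_tss >= proximal_start else "down"


def fits_canonical_model(cbs_to_tss, log2_fc_exponential, log2_fc_starved):
    """True/False if the concordant response matches the expectation, None if not assessed."""
    expected = expected_direction(cbs_to_tss)
    if expected is None:
        return None
    if log2_fc_exponential > 0 and log2_fc_starved > 0:
        observed = "up"
    elif log2_fc_exponential < 0 and log2_fc_starved < 0:
        observed = "down"
    else:
        return None  # response is not concordant between the two conditions
    return observed == expected


if __name__ == "__main__":
    rows = [("Rv0104", 6.5, 1.10, 1.50), ("Rv1057", -59.5, -4.21, -2.89),
            ("Rv0655", -0.5, -0.88, -1.32), ("Rv3616c", -983.5, -0.65, -1.35)]
    for locus, distance, exp_fc, starved_fc in rows:
        print(locus, fits_canonical_model(distance, exp_fc, starved_fc))
```

With this assumed boundary, Rv0104 and Rv1057 follow the canonical pattern, Rv0655 does not, and Rv3616c lies outside the window and is not assessed.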



DISCUSSION 

The 191 CRP Mt -binding sites identified by ChIP-seq
analysis have approximately equal distribution between
intragenic and intergenic locations across the genome of
M. tuberculosis. This is similar to the distribution recently
reported for EspR-binding sites (7), and is intermediate
between the dominant upstream preference identified by
antibody-based ChIP with the 'classical' DosR
transcription factor (4) and the predominantly intragenic
distribution of NAP Lsr2 (8). Using fold-enrichment of bound
sequences as a surrogate measure, there was no evidence of
any difference in the affinity of CRP Mt binding to intergenic
as compared to intragenic sites. The antibody-based
pull-down strategy we used to identify the binding profile of the
native protein did not result in detection of a background
pattern of low affinity sites as described for E. coli CRP (12),
though results from high-throughput screening of tagged
recombinant proteins reveal this to be a common feature
of ChIP analysis of M. tuberculosis transcription factors
(9). Whilst most transcription factors are present at low
or undetectable abundance in proteome profiles measured
by shotgun mass spectrometry, CRP Mt resembles EspR
and MtrA in having a copy number approaching that of
NAPs HupB, MihF and Lsr2 (46). The number of CRP Mt
molecules present in the cell is in several-fold excess of that
required to saturate binding to the CBSs detected by
ChIP-seq.

Table 2. Differential expression of CBS genes during exponential growth and nutrient starvation

                                 Exponential               Starved
Locus      Gene      CRP to TSS  log2 fold change  P adj   log2 fold change  P adj
Rv0046c    ino1      500         -1.22             0.000   -1.25             0.003
Rv0169     mce1A     500         0.84              0.005   2.36              0.000
Rv0469     umaA      500         -0.71             0.020   -0.99             0.043
Rv1660     pks10     500         0.87              0.036   1.02              0.044
Rv2145c    wag31     500         -0.63             0.035   -1.35             0.002
Rv2189c    Rv2189c   500         1.67              0.000   4.01              0.000
Rv2200c    ctaC      500         -2.12             0.000   -3.08             0.000
Rv3629c    Rv3629c   500         1.31              0.001   1.14              0.039
Rv0054     ssb       475.5       -1.02             0.006   -1.67             0.000
Rv3801c    fadD32    159.5       -1.37             0.000   -1.75             0.000
Rv3680     Rv3680    119.5       0.91              0.003   1.51              0.000
Rv2990c    Rv2990c   42.5        -1.89             0.000   -1.67             0.000
Rv0104     Rv0104    6.5         1.10              0.002   1.50              0.002
Rv0167     yrbE1A    3.5         0.97              0.004   2.10              0.000
Rv1230c    Rv1230c   3.5         0.72              0.039   1.24              0.005
Rv0655     mkl       -0.5        -0.88             0.002   -1.32             0.002
Rv2107     PE22      -44.5       -1.63             0.000   -2.25             0.005
Rv1592c a  Rv1592c   -54.5       -1.31             0.000   -1.82             0.000
Rv3219 b   whiB1     -57.5       -1.13             0.000   -1.39             0.001
Rv1057     Rv1057    -59.5       -4.21             0.000   -2.89             0.000
Rv0452     Rv0452    -75.5       1.09              0.009   [illegible]       [illegible]
Rv0885     Rv0885    -88.5       -0.91             0.004   -1.40             0.001
Rv3053c    nrdH      -128.5      -1.41             0.000   -2.44             0.000
Rv0467     icl1      -242.5      -1.68             0.000   -2.57             0.000
Rv2173     idsA2     -246.5      1.00              0.004   1.65              0.000
Rv1379     pyrR      -365.5      1.29              0.000   1.95              0.000
Rv2703     sigA      -407.5      -1.64             0.003   -1.38             0.007
Rv2846c    efpA      -416.5      -0.93             0.001   -2.67             0.000
Rv3616c    espA      -983.5      -0.65             0.028   -1.35             0.002

CBS genes with concordant patterns of differential expression in the crp deletion mutant under both growth conditions are listed. According to the
canonical model for CRP Mt regulation, it is anticipated that genes having a CBS overlapping the transcription start site will show increased expression in
the absence of CRP Mt (highlighted in bold), while genes with a CBS in the upstream region will show decreased expression (underlined). CBSs between
-200 and +20 bp with respect to the transcriptional start site were considered for highlighting differences in expression.
a Additional CBS at -101.5.
b Additional CBS at -57.5 and 47.5.

Taking advantage of a comprehensive map of M. tuberculosis
transcriptional start sites, we were able to calculate
the distance between each CRP Mt -binding site and the
closest TSS. This revealed a pattern of clustering in upstream
and downstream regions, suggesting a parallel with
canonical E. coli CRP regulation, in which CRP binding upstream
of the TSS can activate expression, and CRP binding close
to the TSS is inhibitory. Data generated by combined
ChIP-seq/TSS mapping reproduced previous studies of validated
CRP Mt regulatory targets, confirming an upstream
binding site consistent with activation of serC and Rv0885 (13),
and a binding site occluding the TSS of CRP Mt -repressed
frdA (17). Alignment of ChIP-seq and TSS data sets also
reproduced the more complex situation of tandem activating
and inhibitory binding sites in the whiB1 promoter (21).
Additional potential internal regulatory features include
identification of a CBS shared by mmpS4, implicated in
siderophore export (47), and its potential regulator Rv0452,
and CBSs upstream of both the PE13/PPE18 operon and
its Rv0485 regulator (48).

Deletion of the crp gene had a pronounced impact on the 
growth of M. tuberculosis and on transcription profiles mea- 
sured in exponential and starved cultures. The effect of crp 
deletion on expression of the set of CBS genes was simi- 
lar to its effect on the overall transcriptome, with ~20% of 
the genes showing increased or decreased abundance. Consistent 
with previous publications (13,14,17,21), our results 
showed that CRP Mt can act as a 'classical' 
transcription factor in reducing or enhancing expres- 
sion (dependent on spacing of CBS and TSS), but that this 
paradigm operates at only a subset of CRP Mt -binding sites. 
Two models can be proposed to reconcile the limited direct 
effect of crp deletion on expression of CBS genes with the 
extensive impact of crp deletion on the total transcriptome 
and growth phenotype. It can be envisaged that the primary 
transcription changes are amplified through their effect on 
secondary networks and co-regulation with additional tran- 
scription factors. This model is illustrated by recent map- 
ping of the multiple regulatory layers that contribute to the 
overall function of the FNR transcription factor in E. coli 
(49). 

Figure 3. Transcription factor binding to the promoter region of Rv1057. 
(A) Artemis traces showing the binding of CRP (blue) to the AT-rich region 
upstream of Rv1057. TSS mapping (green) according to (36). Traces record 
the normalized number of mapped reads and the maximum normalized 
read count is indicated. (B) The promoter region of Rv1057 has several 
binding sites for other transcription factors, suggesting that CRP Mt may 
act in concert with other regulators. Transcription factor binding sites are 
shown as coloured boxes [black: LexA (5); red: TrcR (41); green: MprA 
(44); pink: EspR (7); turquoise: CRP]. The arrow denotes the TSS (36). 
Genome coordinates indicate the start of the LexA-binding site (1178795) (5), 
the Rv1057 TSS (36) and the translational start site (1179396). 

Figure 4. CBSs during nutrient starvation. (A) Enrichment of CRP Mt 
binding to DNA during exponential and starvation phase. Box plots indicate 
median (horizontal line), interquartile range (box) and minimum 
and maximum values (whiskers) of the enrichment factor (maxT) of the 
151 shared peaks between exponential growth and the starvation model 
(Mann-Whitney test; ** indicates a significant difference between values, P 
< 0.01). (B) cAMP concentration in the starvation model. 

In an alternative model, it can be envisaged that rather 
than acting as primary determinants, transcription factors 
such as CRP Mt play a complementary role within a regu- 
latory environment dominated by global physiological con- 
trol mechanisms (50). Deletion of CRP Mt may have an influ- 
ence on global physiology in addition to its localized effect 
on individual genes. The absence of expression changes at some 
CBS genes in the crp-deletion strain could reflect a specific 
role of CRP Mt in regulating transcription states that depend 
on environmental conditions, or combined regulation in 
conjunction with other transcription factors. This 
is in agreement with the findings of Hollands et al., who 
observed no CRP-dependent regulation in E. coli at several 
promoters containing high-affinity DNA-binding sites for 
CRP (51). They also suggest that there may be some spe- 
cific conditions where CRP-dependent regulation becomes 
important. Alternatively, they suggest that this may be due 
to CRP playing a role as an NAP, thus influencing the dy- 
namic spatial arrangement of the chromosome (51). 

There is a need for further analysis of the effects of cAMP 
on CRP Mt binding and of the available intracellular concen- 
tration of cAMP in different growth states. In addition to 
its possible role in CRP Mt regulation (52), cAMP binds to 
other transcription factors (53) and has important allosteric 
effects on enzyme function (54). However, the role of cAMP 
in CRP Mt regulation remains unclear. Whilst some struc- 
tural studies demonstrate a conformational change asso- 
ciated with binding of cAMP (55), biochemical analysis 
shows that this has little or no effect on binding to DNA 
(21). Therefore, based on the current understanding of pro- 
tein allostery (56), it is likely that CRP Mt is a 'dynamic' 
protein (more dynamic than E. coli CRP) that can read- 
ily switch to a conformation that promotes DNA interac- 
tion, even when cAMP is not bound. cAMP binding to CRP 
may increase the fraction of protein in the DNA-binding 
state, via conformational selection, which is reflected in 'enhanced' 
DNA binding. When cAMP levels fall during 
starvation, the fraction of CRP in the DNA-binding conformation 
is lowered, thereby reducing occupancy at CRP-binding 
sites. This significant DNA binding by CRP Mt in 
the absence of cAMP suggests an evolutionary adaptation 
given the high levels of cAMP seen in mycobacteria. There- 
fore CRP Mt could have evolved to act as a DNA-coating 
protein or a recruiting protein for other transcription fac- 
tors or RNA polymerase. CRP Mt levels are quite high in 
the cell, so any decrease in cAMP levels could also be offset 
by the high levels of CRP Mt present at any given time. 
However, cAMP binding to CRP resulting in the sustained 
occupancy of a CRP site may influence the interaction of 
CRP Mt with other transcriptional regulators and chromo- 
some organizing proteins. 
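
The conformational selection argument above can be made concrete with a generic two-state linkage model. The sketch below is an illustration under stated assumptions and is not a model fitted to CRP Mt: it treats the protein as having a single cAMP site (the real protein is a dimer), and the function name and all parameter values are hypothetical. With L the ratio of DNA-binding to non-binding conformations in the absence of cAMP, and K_active and K_inactive the cAMP dissociation constants of the two conformations, the fraction in the DNA-binding state is L(1 + c/K_active) / [L(1 + c/K_active) + (1 + c/K_inactive)].

```python
def fraction_dna_binding(camp, L, K_active, K_inactive):
    """Fraction of protein in the DNA-binding conformation.

    Generic two-state, single-site linkage model (an assumption made
    for illustration, not a measured description of CRP Mt).
    camp       : free cAMP concentration (same units as K_active, K_inactive)
    L          : [DNA-binding] / [non-binding] ratio without cAMP
    K_active   : cAMP dissociation constant of the DNA-binding state
    K_inactive : cAMP dissociation constant of the non-binding state
    """
    active = L * (1.0 + camp / K_active)
    inactive = 1.0 + camp / K_inactive
    return active / (active + inactive)


# Hypothetical parameters: a 'dynamic' protein that is already half in the
# DNA-binding state without ligand (L = 1), and whose DNA-binding state
# has a ten-fold higher affinity for cAMP.
for camp in (0.0, 1.0, 10.0, 100.0):
    print(camp, round(fraction_dna_binding(camp, 1.0, 1.0, 10.0), 3))
```

In this picture a large L corresponds to substantial DNA binding without cAMP, and a change in cAMP concentration shifts occupancy only modestly, which is the qualitative behaviour proposed above for CRP Mt.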